software agent
Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?
Xia, Chunqiu Steven, Wang, Zhe, Yang, Yan, Wei, Yuxiang, Zhang, Lingming
Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are typically equipped with a suite of coding tools and can autonomously decide the next actions to form complete trajectories to solve end-to-end software tasks. While promising, they typically require dedicated design and may still be suboptimal, since it can be extremely challenging and costly to exhaust the entire agent scaffold design space. Recognizing that software agents are themselves software that can be further refined or modified, researchers have recently proposed a number of self-improving software agents, including the Darwin-Gödel Machine (DGM). However, such self-improving agents require costly offline training on specific benchmarks and may not generalize well across different LLMs or benchmarks. In this paper, we propose Live-SWE-agent, the first live software agent that can autonomously and continuously evolve itself on the fly during runtime when solving real-world software problems. More specifically, Live-SWE-agent starts with the most basic agent scaffold with only access to bash tools (e.g., mini-SWE-agent), and autonomously evolves its own scaffold implementation while solving real-world software problems. Our evaluation on the widely studied SWE-bench Verified benchmark shows that Live-SWE-agent can achieve an impressive solve rate of 77.4% without test-time scaling, outperforming all existing software agents, including the best proprietary solution. Moreover, Live-SWE-agent outperforms state-of-the-art manually crafted software agents on the recent SWE-Bench Pro benchmark, achieving the best-known solve rate of 45.8%.
Bridging Literature and the Universe Via A Multi-Agent Large Language Model System
Zhang, Xiaowen, Bi, Zhenyu, Lachance, Patrick, Wang, Xuan, Di Matteo, Tiziana, Croft, Rupert A. C.
As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations from papers published on arXiv and in leading journals, covering diverse simulation types. Experiments on the dataset demonstrate the strong performance of SimAgents, highlighting its effectiveness and potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > Virginia (0.04)
Agentic Business Process Management: Practitioner Perspectives on Agent Governance in Business Processes
Vu, Hoang, Klievtsova, Nataliia, Leopold, Henrik, Rinderle-Ma, Stefanie, Kampik, Timotheus
With the rise of generative AI, industry interest in software agents is growing. Given the stochastic nature of generative AI-based agents, their effective and safe deployment in organizations requires robust governance, which can be facilitated by agentic business process management. However, given the nascence of this new-generation agent notion, it is not clear what BPM practitioners consider to be an agent, and what benefits, risks and governance challenges they associate with agent deployments. To investigate how organizations can effectively govern AI agents, we conducted a qualitative study involving semi-structured interviews with 22 BPM practitioners from diverse industries. They anticipate that agents will enhance efficiency, improve data quality, ensure better compliance, and boost scalability through automation, while also cautioning against risks such as bias, over-reliance, cybersecurity threats, job displacement, and ambiguous decision-making. To address these challenges, the study presents six key recommendations for the responsible adoption of AI agents: define clear business goals, set legal and ethical guardrails, establish human-agent collaboration, customize agent behavior, manage risks, and ensure safe integration with fallback options. Additionally, the paper outlines actions to align traditional BPM with agentic AI, including balancing human and agent roles, redefining human involvement, adapting process structures, and introducing performance metrics. These insights provide a practical foundation for integrating AI agents into business processes while preserving oversight, flexibility, and trust.
- Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Hawaii (0.04)
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (0.66)
- Research Report > New Finding (0.46)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.54)
Every Software as an Agent: Blueprint and Case Study
The rise of (multimodal) large language models (LLMs) has shed light on software agents, in which software can understand and follow user instructions in natural language. However, existing approaches such as API-based and GUI-based agents are far from satisfactory in terms of accuracy and efficiency. Instead, we advocate endowing LLMs with access to the software internals (source code and runtime context) and the permission to dynamically inject generated code into the software for execution. In such a whitebox setting, one may better leverage the software context and the coding ability of LLMs. We then present an overall design architecture and case studies on two popular web-based desktop applications. We also give an in-depth discussion of the challenges and future directions. We believe that such a new paradigm has the potential to fundamentally overturn existing software agent design, and ultimately to create a digital world in which software can comprehend, operate, collaborate, and even think to meet complex user needs.
The AI-Powered Future of Coding Is Near
I am by no means a skilled coder, but thanks to a free program called SWE-agent, I was just able to debug and fix a gnarly problem involving a misnamed file within different code repositories on the software-hosting site GitHub. I pointed SWE-agent at an issue on GitHub and watched as it went through the code and reasoned about what might be wrong. It correctly determined that the root cause of the bug was a line that pointed to the wrong location for a file, then navigated through the project, located the file, and amended the code so that everything ran properly. It's the kind of thing that an inexperienced developer (such as myself) might spend hours trying to debug. Many coders already use artificial intelligence to write software more quickly.
User-Like Bots for Cognitive Automation: A Survey
Gidey, Habtom Kahsay, Hillmann, Peter, Karcher, Andreas, Knoll, Alois
Software bots have attracted increasing interest and popularity in both research and society. Their contributions span automation, digital twins, game characters with conscious-like behavior, and social media. However, there is still a lack of intelligent bots that can adapt to the variability and dynamic nature of digital web environments. Unlike human users, they have difficulty understanding and exploiting the affordances across multiple virtual environments. Despite the hype, bots with human user-like cognition do not currently exist. Chatbots, for instance, lack situational awareness on the digital platforms where they operate, preventing them from enacting meaningful and autonomous intelligent behavior similar to human users. In this survey, we aim to explore the role of cognitive architectures in supporting efforts towards engineering software bots with advanced general intelligence. We discuss how cognitive architectures can contribute to creating intelligent software bots. Furthermore, we highlight key architectural recommendations for the future development of autonomous, user-like cognitive bots.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- South America > Colombia > Bolivar Department > Cartagena (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Research Report (1.00)
- Overview (0.86)
- Information Technology (0.93)
- Health & Medicine > Therapeutic Area > Neurology (0.70)
- Government > Military (0.66)
- Health & Medicine > Consumer Health (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (0.94)
Systematic Comparison of Software Agents and Digital Twins: Differences, Similarities, and Synergies in Industrial Production
Reinpold, Lasse Matthias, Wagner, Lukas Peter, Gehlhoff, Felix, Ramonat, Malte, Kilthau, Maximilian, Gill, Milapji Singh, Reif, Jonathan Tobias, Henkel, Vincent, Scholz, Lena, Fay, Alexander
To achieve a highly agile and flexible production, it is envisioned that industrial production systems gradually become more decentralized, interconnected, and intelligent. Within this vision, production assets collaborate with each other, exhibiting a high degree of autonomy. Furthermore, knowledge about individual production assets is readily available throughout their entire life-cycles. To realize this vision, adequate use of information technology is required. Two commonly applied software paradigms in this context are Software Agents (referred to as Agents) and Digital Twins (DTs). This work presents a systematic comparison of Agents and DTs in industrial applications. The goal of the study is to determine the differences, similarities, and potential synergies between the two paradigms. The comparison is based on the purposes for which Agents and DTs are applied, the properties and capabilities exhibited by these software paradigms, and how they can be allocated within the Reference Architecture Model Industry 4.0. The comparison reveals that Agents are commonly employed in the collaborative planning and execution of production processes, while DTs typically play a more passive role in monitoring production resources and processing information. Although these observations imply characteristic sets of capabilities and properties for both Agents and DTs, a clear and definitive distinction between the two paradigms cannot be made. Instead, the analysis indicates that production assets utilizing a combination of Agents and DTs would demonstrate high degrees of intelligence, autonomy, sociability, and fidelity. To achieve this, further standardization is required, particularly in the field of DTs.
- Information Technology (1.00)
- Energy > Oil & Gas > Upstream (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
Smarter AI Assistants Could Make It Harder to Stay Human
Researchers and futurists have been talking for decades about the day when intelligent software agents will act as personal assistants, tutors, and advisers. Apple produced its famous Knowledge Navigator video in 1987. I seem to remember attending an MIT Media Lab event in the 1990s about software agents, where the moderator appeared as a butler, in a bowler hat. With the advent of generative AI, that gauzy vision of software as aide-de-camp has suddenly come into focus. WIRED's Will Knight provided an overview this week of what's available now and what's imminent.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)
Towards Cognitive Bots: Architectural Research Challenges
Gidey, Habtom Kahsay, Hillmann, Peter, Karcher, Andreas, Knoll, Alois
Software bots operating in multiple virtual digital platforms must understand the platforms' affordances and behave like human users. Platform affordances or features differ from one application platform to another or change over a platform's life cycle, requiring such bots to be adaptable. Moreover, bots in such platforms could cooperate with humans or other software agents for work or to learn specific behavior patterns. However, beyond language processing and prediction, present-day bots, particularly chatbots, are far from reaching a human user's level of behavior within complex business information systems. They lack the cognitive capabilities to sense and act in such virtual environments, rendering their development a challenge to artificial general intelligence research. In this study, we problematize and investigate assumptions in conceptualizing software bot architecture by directing attention to significant architectural research challenges in developing cognitive bots endowed with complex behavior for operation on information systems. As an outlook, we propose alternate architectural assumptions to consider in future bot design and bot development frameworks.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > United States (0.04)
Agent-Cells with DNA Programming: A Dynamic Decentralized System
This paper introduces a new concept: giving life to a software agent. A software agent is a computer program that acts on a user's behalf. We put a DNA inside the agent. The DNA is a simple text, a complete and detailed roadmap of a network of agents or a system: a Dynamic Numerical Abstract of a multiagent system. It is also the reproductive part of an agent, which makes the agent take actions, decide independently, and reproduce coworkers. By defining different DNA structures, one can establish new agents and different nets for different usages. We call this way of thinking DNA programming. This strategy leads to a new field of programming, one that can help us manage large systems with various elements in a highly organized, customizable structure. An agent can reproduce another agent. We put one or a few agents around a given network, and the agents reproduce themselves until they can reach others and pervade the whole network. An agent's position or other environmental or geographical characteristics make it possible for the agent to know the active set of genes on its DNA. The active set of genes specifies its duties. A database holds a list of functions, each of which implements what a gene represents. To make this database decentralized, we may use a blockchain-based structure. This design can adapt to a system that manages many static and dynamic networks. Such a network could be a distributed system, a decentralized system, a telecommunication network such as a 5G monitoring system, an IoT management system, or even an energy management system. The final system is the combination of all the agents and the overlay net that connects them. We call the final net the body of the system.
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Hong Kong (0.04)
- Telecommunications (0.87)
- Energy > Power Industry (0.49)
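The gene-activation and reproduction mechanism described in the Agent-Cells abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `DNA` dictionary layout, the `GENE_FUNCTIONS` table standing in for the gene database, the position names, and the `pervade` helper are all hypothetical. The sketch only shows the general idea that an agent's position selects its active genes and that agents spawn copies until the network is covered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical gene database: gene name -> function implementing that gene.
GENE_FUNCTIONS: Dict[str, Callable[["Agent"], str]] = {
    "monitor": lambda agent: f"{agent.position}: monitoring",
    "relay": lambda agent: f"{agent.position}: relaying",
}

# Hypothetical DNA: a roadmap of the whole network. For each position it
# lists the genes that are active there and the neighbouring positions.
DNA = {
    "edge": {"genes": ["monitor"], "neighbors": ["core"]},
    "core": {"genes": ["monitor", "relay"], "neighbors": ["edge", "backup"]},
    "backup": {"genes": ["relay"], "neighbors": ["core"]},
}


@dataclass
class Agent:
    position: str
    dna: dict  # every agent carries the full DNA

    def active_genes(self) -> List[str]:
        # The agent's position determines which genes are active.
        return list(self.dna[self.position]["genes"])

    def act(self) -> List[str]:
        # Duties are the functions that implement the active genes.
        return [GENE_FUNCTIONS[gene](self) for gene in self.active_genes()]

    def reproduce(self, occupied: Set[str]) -> List["Agent"]:
        # Spawn a copy (same DNA) at each neighbouring position not yet occupied.
        children = []
        for neighbor in self.dna[self.position]["neighbors"]:
            if neighbor not in occupied:
                occupied.add(neighbor)
                children.append(Agent(neighbor, self.dna))
        return children


def pervade(seed_position: str, dna: dict) -> Dict[str, Agent]:
    # Start from one seed agent and let agents reproduce until no
    # unoccupied neighbouring position remains.
    occupied = {seed_position}
    frontier = [Agent(seed_position, dna)]
    agents: Dict[str, Agent] = {}
    while frontier:
        agent = frontier.pop(0)
        agents[agent.position] = agent
        frontier.extend(agent.reproduce(occupied))
    return agents


agents = pervade("edge", DNA)
```

In this toy setup a single seed agent placed at `edge` ends up with copies at `core` and `backup`, and each copy performs only the duties its position activates. A centralized dictionary is used for the gene database purely for brevity; the abstract suggests a blockchain-based structure for the decentralized case, which this sketch does not attempt.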